A Realizable Learning Task which Exhibits Overfitting
Abstract
In this paper we examine a perceptron learning task. The task is realizable, since it is provided by another perceptron with identical architecture. Both perceptrons have nonlinear sigmoid output functions. The gain of the output function determines the level of nonlinearity of the learning task. It is observed that a high level of nonlinearity leads to overfitting. We give an explanation for this rather surprising observation and develop a method to avoid the overfitting. This method has two possible interpretations: one is learning with noise, the other cross-validated early stopping.

1 Learning Rules from Examples

The property that makes feedforward neural nets interesting for many practical applications is their ability to approximate functions that are given only by examples. Feedforward networks with at least one hidden layer of nonlinear units can approximate any continuous function on an N-dimensional hypercube arbitrarily well. While the existence of such neural function approximators is established, there is still a lack of knowledge about their practical realization. Moreover, major problems that complicate a good realization, such as overfitting, need a better understanding.

In this work we study overfitting in a one-layer perceptron model. The model allows a good theoretical description while already exhibiting behavior qualitatively similar to that of the multilayer perceptron. A one-layer perceptron has N input units and one output unit. Between input and output it has one layer of adjustable weights $W_i$ $(i = 1, \ldots, N)$. The output $z$ is a possibly nonlinear function of the weighted sum of the inputs $x_i$, i.e.

$$z = g(h) \,, \quad \text{with} \quad h = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} W_i \, x_i \,. \qquad (1)$$

The quality of the function approximation is measured by the difference between the correct output $z_*$ and the net's output $z$, averaged over all possible inputs.
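The forward pass in Eq. (1) can be illustrated with a short sketch. The concrete choice of sigmoid, $g(h) = \tanh(\gamma h)$ with gain parameter `gamma`, is an assumption for illustration, since the text only specifies a nonlinear sigmoid output function whose gain sets the level of nonlinearity:

```python
import numpy as np

def perceptron_output(w, x, gamma=1.0):
    """Output z = g(h) with h = (1/sqrt(N)) * sum_i W_i * x_i.

    The sigmoid g(h) = tanh(gamma * h) is an illustrative choice;
    gamma plays the role of the gain, i.e. the degree of nonlinearity.
    """
    n = len(w)
    h = np.dot(w, x) / np.sqrt(n)   # normalized weighted sum of inputs
    return np.tanh(gamma * h)

# Example with N = 4 inputs
w = np.array([0.5, -1.0, 0.3, 0.8])
x = np.array([1.0, 1.0, -1.0, 0.5])
z = perceptron_output(w, x, gamma=2.0)
```

For small `gamma` the output is nearly linear in `h`; for large `gamma` it approaches a step function, which is the sense in which the gain controls the nonlinearity of the task.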
In the supervised learning scheme one trains the network using a set of examples $\vec{x}^\mu$ $(\mu = 1, \ldots, P)$ for which the correct output is known. The learning task is to minimize a certain cost function, which measures the difference between the correct output $z_*^\mu$ and the net's output $z^\mu$, averaged over all examples. Using the mean squared error as a suitable measure for the difference between the outputs, we can define the training error $E_T$ and the generalization error $E_G$ as

$$E_T = \frac{1}{2P} \sum_{\mu=1}^{P} \left( z_*^\mu - z^\mu \right)^2 \,, \qquad E_G = \frac{1}{2} \left\langle \left( z_* - z \right)^2 \right\rangle_{\vec{x}} \,.$$
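Since the task is realizable, it can be simulated in a teacher–student setup: the teacher perceptron supplies the correct outputs $z_*^\mu$, and the student has the identical architecture. The following sketch estimates $E_T$ on the training set and $E_G$ by averaging over fresh random inputs; the $\tanh$ sigmoid, the Gaussian inputs, and the Monte Carlo estimate of the input average are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def output(w, X, gamma=1.0):
    # z = g(h), h = (1/sqrt(N)) w.x, with g(h) = tanh(gamma*h) as example sigmoid
    return np.tanh(gamma * (X @ w) / np.sqrt(len(w)))

N, P, gamma = 20, 50, 2.0
w_teacher = rng.standard_normal(N)   # teacher defines the realizable task
w_student = rng.standard_normal(N)   # student with identical architecture

# Training error: mean squared error over the P examples
X_train = rng.standard_normal((P, N))
E_T = 0.5 * np.mean((output(w_teacher, X_train, gamma)
                     - output(w_student, X_train, gamma)) ** 2)

# Generalization error: same error averaged over fresh inputs (Monte Carlo)
X_test = rng.standard_normal((10000, N))
E_G = 0.5 * np.mean((output(w_teacher, X_test, gamma)
                     - output(w_student, X_test, gamma)) ** 2)
```

When the student weights equal the teacher weights, both errors vanish, which is exactly what makes the task realizable.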